Cepstral analysis synthesis on the mel frequency scale
نویسنده
چکیده
Psychophysical studies have shown that human perception of the frequency content of sounds, either for pure tones or for speech signals, does not follow a linear scale. This research has led to the idea of defining subjective pitch of pure tones. Thus for each tone with an actual frequency, f, measured in Hz, a subjective pitch is measured on a scale called ''Mel'' scale. As a reference point, the pitch of a 1kHz tone, 40 dB above the perceptual hearing threshold, is defined as 1000 mels. Other subjective pitch values are obtained by adjusting the frequency of a tone such that it is half or twice the perceived pitch of a reference tone (with a known mel frequency).
منابع مشابه
Perceptual Significance of Cepstral Distortion Measures in Digital Speech Processing
Currently, one of the most widely used distance measures in speech and speaker recognition is the Euclidean distance between mel frequency cepstral coefficients (MFCC). MFCCs are based on filter bank algorithm whose filters are equally spaced on a perceptually motivated mel frequency scale. The value of mel cepstral vector, as well as the properties of the corresponding cepstral distance, are d...
متن کاملMel Frequency Cepstral Coefficients for Music Modeling
We examine in some detail Mel Frequency Cepstral Coefficients (MFCCs) the dominant features used for speech recognition and investigate their applicability to modeling music. In particular, we examine two of the main assumptions of the process of forming MFCCs: the use of the Mel frequency scale to model the spectra; and the use of the Discrete Cosine Transform (DCT) to decorrelate the Mel-spec...
متن کاملFrequency-warping in speech
In this paper we present results that indicate that the formant frequencies between di erent speakers scale di erently at di erent frequencies. Based on our experiments on speech data, we then numerically compute a universal frequencywarping function, to make the scale-factor independent of frequency in the warped domain. The proposed warping function is found to be similar to the mel-scale, wh...
متن کاملDetermining the effective features in classification of heart sounds using trained intelligent network and genetic algorithm
Heart diseases are among the most important causes of mortality in the world, especially in industrial countries. Using heart sounds and the features extracted from them are among the non-aggressive diagnosis and prognosis methods for heart diseases. In this study, the time-scale, Cepstral, frequency, temporal and turbulence features are saved and extracted from the heart sounds, and then they ...
متن کاملCombining evidences from mel cepstral, cochlear filter cepstral and instantaneous frequency features for detection of natural vs. spoofed speech
Speech synthesis and voice conversion techniques can pose threats to current speaker verification (SV) systems. For this purpose, it is essential to develop front end systems that are able to distinguish human speech vs. spoofed speech (synthesized or voice converted). In this paper, for the ASVspoof 2015 challenge, we propose a detector based on combination of cochlear filter cepstral coeffici...
متن کامل